Abstract

Boolean networks are discrete dynamical systems in which the state (zero or one) of each network
node at time t is updated to a state determined by the states at time t - 1 of those nodes that
have links to it. Boolean networks with 'canalizing' update rules have been of great interest in the
modeling of genetic control. A canalizing update rule is one for which the node state at time t is
determined by the state at time t - 1 of a particular one of its inputs when that input is in its
canalizing state. In this paper, we introduce a generalized concept of canalization that we believe 
offers a significant enhancement in biological relevance, and we obtain a simple general network 
stability criterion for Boolean networks with generalized canalization for a broad class of network 
topologies. 
Boolean networks have been extensively studied as a model for genetic control of cells
[1]. In this framework, the genetic regulatory network is modeled as a directed graph,
where links correspond to the influence of one gene on the expression of another. Individual
genes are either off or on, represented as 0 or 1, respectively, and the state of a gene at
time t + 1 is given by a Boolean update function of the states of its inputs at time t. In
early analyses, both the network topology and the update functions were assumed to be
random. In particular, Kauffman's N-K network model has received significant
study. According to this model, there are N nodes (genes) in the network, each having the
same number K of input links, and the nodes from which these input links originate are
chosen randomly with uniform probability. Additionally, the update function determining
the time evolution at each node is defined by a random, time-independent, $2^K$-entry truth
table characterized by the 'bias' p, which, as discussed subsequently, is the probability that
a one appears as the output of the update function. Using the Hamming distance between
two states of the system (i.e., the number of nodes for which the states disagree) as the
distance measure, these systems, when large, exhibit both 'chaotic' (or unstable) behavior,
where the distance between typical initially close states on average grows in time, and
stable behavior, where the distance decreases. Separating the two is a 'critical' regime.

In Ref. [4], we developed an approximate technique for determining the stability of
large Boolean networks. A key feature of this work was that it allowed one to investigate
the effect of a given specific network topology. Numerical experiments were performed [4]
exploring such effects as correlation between the number of inputs and outputs at each node,
assortativity [12], community structure, etc., and these experiments yielded results
accurately predicted by the theory.

So-called 'canalizing functions' are a significant modification of the random truth table
model of previous work [14]. Canalizing functions, believed to be biologically relevant [5], [6],
are those functions where an argument of the function (the 'canalizing input'), having a
certain value (the 'canalizing value'), determines the value of the function independent of
the values of the other arguments (inputs). If the canalizing input does not have the
canalizing value, the function is determined by the other inputs. (Further refinements can
include a hierarchy of canalization.) Canalizing functions often stabilize networks that
would be unstable in their absence. Shmulevich and Kauffman defined the 'activity'
of a Boolean variable on a Boolean function, which can be used to quantify the increased
importance of canalizing inputs.

In this paper, we present a generalized model of canalizing behavior that we believe
offers an enhancement in the biological relevance of Boolean network models. We use the
Shmulevich-Kauffman activity to extend the results of Ref. [4] to the case of networks
with canalizing functions. We derive a hypothesized condition under which such networks
are stable, and we numerically test this criterion. A significant point is that our stability
criterion applies to networks of very general topology.

Boolean networks comprise a state vector $S(t) = [\sigma_1(t), \sigma_2(t), \ldots, \sigma_N(t)]^T$,
where each $\sigma_i \in \{0, 1\}$, and a set of update functions $f_i$, such that

$$\sigma_i(t) = f_i\big(\sigma_{j(i,1)}(t-1), \sigma_{j(i,2)}(t-1), \ldots\big), \tag{1}$$

where $j(i,1), j(i,2), \ldots, j(i,K_i^{in})$ denote the indices of the $K_i^{in}$ nodes that input
to node i, and we denote this set of nodes by $\mathcal{J}_i = \{j(i,k) \mid k = 1, 2, \ldots, K_i^{in}\}$.
(In the following discussion, k, which is between 1 and $K_i^{in}$, is used to label an input to
node i, or, equivalently, an argument of $f_i$; j, which is between 1 and N, refers to the network
index of the node corresponding to input k; $k(i,j)$ and $j(i,k)$ are used to switch between them.
Similarly, $\sigma_j$ is the state of node j, and $s_k$ is the k-th input to $f_i$.) The number of
input links $K_i^{in}$ to node i is called its in-degree, and the number of output links $K_i^{out}$
from node i is called its out-degree. The update function $f_i$ at each node i is usually defined by
a truth table, where the table for node i has $2^{K_i^{in}}$ rows, one for each possible set of the
states of the $K_i^{in}$ nodes that input to node i, and each input state row is followed by its
resulting update output state for node i, thus forming a $2^{K_i^{in}}$-entry output column. The
stability of a large Boolean network is defined by considering the trajectories resulting from two
close initial states, $S(t)$ and $\tilde{S}(t)$. To quantify their divergence, the Hamming distance
of coding theory is used: $h(t) = \sum_{i=1}^{N} |\sigma_i(t) - \tilde{\sigma}_i(t)|$.
If the network is stable, on average $h(t) \to 0$ as $t \to \infty$. In unstable networks, $h(t)$
quickly increases to $O(N)$, while a 'critical' network is at the border separating stability and chaos.
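
As a rough illustration of these definitions, the following Python sketch builds a small frozen random network, applies the synchronous update of Eq. (1), and records the Hamming distance h(t) between two initially close trajectories. The function names, the network size, the bias of 0.5, and the choice of distinct inputs per node are illustrative assumptions, not details taken from the paper.

```python
import random

def make_nk_network(n, k, bias, rng):
    """Random network: each node gets k distinct random inputs (an assumption)
    and a truth table whose 2**k outputs are 1 with probability `bias`."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[1 if rng.random() < bias else 0 for _ in range(2 ** k)]
              for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronous update, Eq. (1): each node reads its inputs at time t - 1."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        row = 0
        for j in node_inputs:            # pack the input states into a row index
            row = (row << 1) | state[j]
        new_state.append(table[row])
    return new_state

def hamming(a, b):
    """Number of nodes whose states differ between two network states."""
    return sum(x != y for x, y in zip(a, b))

rng = random.Random(0)
n, k = 200, 3
inputs, tables = make_nk_network(n, k, bias=0.5, rng=rng)
state = [rng.randint(0, 1) for _ in range(n)]
perturbed = list(state)
for i in rng.sample(range(n), 2):        # flip two nodes: a close initial state
    perturbed[i] ^= 1

history = []                             # h(t) for t = 0, 1, ..., 19
for _ in range(20):
    history.append(hamming(state, perturbed))
    state = step(state, inputs, tables)
    perturbed = step(perturbed, inputs, tables)
```

Averaging `history` over many perturbations and truth-table realizations would give the kind of average divergence the stability definition refers to; a single run like this one is only indicative.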

In order to study the stability of N-K Boolean networks, Derrida and Pomeau
considered an annealing procedure and calculated the probability that, after t steps, a node
state is the same on two trajectories that originated from initially close conditions. (Later
authors generalized the Derrida-Pomeau analysis to include variable in-degree 15l4l7l| and
joint in-degree/out-degree distributions [20].) In their annealed situation, at each time step
t the truth table outputs and the network of connections are randomly chosen. The actual
situation of interest, however, is the case of 'frozen-in' networks, where the truth table
and network of connections are fixed in time. It was hypothesized and later numerically
confirmed that, for large networks, results obtained in the analytically tractable annealed
situation are the same as those in the analytically intractable frozen situation. In deriving
the results of Ref. [4], we used a 'semiannealed' procedure in which the network connections
were frozen, but truth table outputs were randomly chosen on each time step. Again, the
aforementioned hypothesis is very well supported numerically.



The semiannealing procedure used in Ref. [4] independently and randomly reassigned
the output elements of the truth table governing node i to be one or zero with probability
$p_i$ or $1 - p_i$, respectively. However, canalizing functions do not have this property: if the
canalizing input takes its canalizing value in a given row of the table, the probability of a
one appearing in the output row is zero (or one). We call this behavior 'strictly canalizing.'
Considering all possible inputs to have equal probability, we now introduce a generalization
of strictly canalizing behavior to the case of 'quasicanalizing' inputs, which we define as the
case where the probability that a one appears in the output of node i's truth table if input k
takes value s, averaged over all other inputs, $p_i^{(k,s)}$, depends on s. (The average of
$p_i^{(k,s)}$ over both values of s, $p_i^* = (p_i^{(k,0)} + p_i^{(k,1)})/2$, is the 'effective bias.'
This quantity will be discussed in detail below.) Strict canalization with respect to input k,
therefore, is the case when $p_i^{(k,s)} = 0$ or $1$ when s is the canalizing value. If
$p_i^{(k,0)} = p_i^{(k,1)}$, k is a non-canalizing input to
i. We call truth tables where all inputs are non-canalizing 'unstructured,' and those with any
canalizing inputs, strict or quasi-, 'structured.' Given our generalized definition of canalizing
behavior, in the rest of this paper we formulate a modified semiannealing procedure, assume
that this approximates the frozen-in case to derive the stability criterion, and numerically
confirm that the stability criterion holds in the frozen-in case with structured truth tables.
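
To make the definition of the conditional biases concrete, the sketch below computes them for one fixed truth table, weighting all input rows equally as in the text, and labels each input as strictly canalizing, quasicanalizing, or non-canalizing. The helper names `conditional_biases` and `classify_input` and the example function are hypothetical choices made for illustration.

```python
from itertools import product

def conditional_biases(table, k_in):
    """Return p[k][s]: fraction of ones among the rows where input k equals s.

    `table` maps each input tuple (s_1, ..., s_K) to an output bit.
    """
    p = [[0.0, 0.0] for _ in range(k_in)]
    half = 2 ** (k_in - 1)               # number of rows with input k fixed to s
    for row, out in table.items():
        for k, s in enumerate(row):
            p[k][s] += out / half
    return p

def classify_input(p0, p1, tol=1e-12):
    """Label an input from its two conditional biases, following the text."""
    if abs(p0 - p1) <= tol:
        return "non-canalizing"
    if min(p0, p1) <= tol or max(p0, p1) >= 1 - tol:
        return "strictly canalizing"
    return "quasicanalizing"

# Example function (an arbitrary choice): f(s1, s2, s3) = s1 AND (s2 OR s3)
table = {row: row[0] & (row[1] | row[2]) for row in product((0, 1), repeat=3)}
p = conditional_biases(table, 3)
effective_bias = [(p0 + p1) / 2 for p0, p1 in p]
labels = [classify_input(p0, p1) for p0, p1 in p]
```

For this example the first input is strictly canalizing (a zero on it forces a zero output), the other two are quasicanalizing, and the effective bias comes out the same whichever input is used to compute it.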

A crucial quantity in the theory of Ref. [4] is the 'sensitivity' of a node, which is the
probability that any change to a node's inputs causes a change in the node's output. This
quantity treats all input nodes as equally important, however, and this is clearly not appropriate
in the case of canalizing functions. A remedy for this is to use the activity of input k
on $f_i$, $r_{ik}$, which is the probability that the output of $f_i$ changes if only input k changes.
Presupposing a mapping between the set of $p_i^{(k,s)}$ that describe $f_i$ and the activities
$r_{ik}$ and a suitable semiannealing procedure (both of which we derive below), we can now extend
the procedure of Ref. [4] to account for canalizing behavior in analyzing Boolean network
stability.
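
The activity of an input can be illustrated on a single fixed Boolean function by counting the input rows for which flipping that input alone flips the output. This is only a sketch for a frozen truth table with equally weighted rows; the function name `activity` and the example function are assumptions, and the semiannealed activities used in the theory are derived later in the text.

```python
from itertools import product

def activity(f, k_in, k):
    """Fraction of input rows for which flipping input k alone changes f."""
    flips = 0
    for row in product((0, 1), repeat=k_in):
        flipped = list(row)
        flipped[k] ^= 1                  # change only input k
        flips += f(*row) != f(*flipped)
    return flips / 2 ** k_in

def f(s1, s2, s3):
    """Example function (an arbitrary choice): s1 AND (s2 OR s3)."""
    return s1 & (s2 | s3)

activities = [activity(f, 3, k) for k in range(3)]   # -> [0.75, 0.25, 0.25]
```

The canalizing input s1 has a markedly higher activity than the other two, which is the asymmetry that a single per-node sensitivity cannot express.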

We define the N-dimensional vector y(t), where each element $y_i(t)$ tracks the probability
that node i differs between two initially close states after t time-steps:
$y_i(t) = \Pr[\sigma_i(t) \neq \tilde{\sigma}_i(t)]$. Our goal is to derive an update equation
for $y_i(t)$ and perform linear stability analysis
on the solution $y_i(t) = 0$. The update equation will be derived under the assumption that
the inputs $y_j(t)$ are statistically independent of one another. This assumption holds in the
case of locally tree-like topology [4, 10].



Since we are performing linear stability analysis, we can make several simplifying
approximations. The probability of d inputs to node i being different between the trajectories
of two initially close states is of order $O(y^d (1 - y)^{K_i^{in} - d}) \sim O(y^d)$. Since in linear
stability y is taken to be small, the probability that only input node j to node i is flipped is
approximately $y_j(t)$, and the probability that this occurs and leads to a flip in the output of
node i is $r_{ik}\, y_{j(i,k)}(t)$. Thus we get the following approximate evolution equation for small
perturbations from the solution y(t) = 0:

$$y_i(t+1) \approx \sum_{k=1}^{K_i^{in}} r_{ik}\, y_{j(i,k)}(t) + O(y^2). \tag{2}$$

This can be written in matrix form after discarding the higher-order terms as

$$y(t) = R\, y(t-1), \tag{3}$$

where R is the 'activity matrix' with elements $R_{ij} = r_{ik}$ if there is a link from j to i
($k = k(i,j)$), and zero otherwise. From this equation, we see that stability is determined by
the largest eigenvalue $\lambda_R$ of this matrix:

$$\begin{aligned}
\lambda_R > 1, &\quad y = 0 \text{ is unstable};\\
\lambda_R = 1, &\quad y = 0 \text{ is critical};\\
\lambda_R < 1, &\quad y = 0 \text{ is stable}.
\end{aligned} \tag{4}$$
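
The criterion of Eq. (4) can be sketched numerically as follows: assemble the activity matrix R from a list of input links and per-input activities, estimate its largest eigenvalue by power iteration, and compare it with one. The toy five-node topology, the function names, and the activity values are illustrative assumptions; the value 2p(1 - p) is what Eq. (9) gives when every output probability equals p, and the second set of activities is hypothetical. Power iteration is used here only to keep the sketch dependency-free, and it assumes a non-negative matrix whose graph is strongly connected and aperiodic.

```python
def activity_matrix(inputs, activities):
    """R[i][j] = r_ik if node j is the k-th input of node i, and zero otherwise."""
    n = len(inputs)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k, j in enumerate(inputs[i]):
            R[i][j] = activities[i][k]
    return R

def largest_eigenvalue(matrix, iterations=1000):
    """Power iteration on a non-negative matrix, normalising by the entry sum."""
    n = len(matrix)
    vec = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(new)
        if norm == 0.0:                  # perturbations die out in one step
            return 0.0
        eigenvalue = norm / sum(vec)
        vec = [x / norm for x in new]
    return eigenvalue

def classify(lam, tol=1e-9):
    """Apply Eq. (4) to the largest eigenvalue of R."""
    if lam > 1 + tol:
        return "unstable"
    if lam < 1 - tol:
        return "stable"
    return "critical"

# Toy topology (an assumption): node i reads nodes i+1, i+2, i+3 modulo n.
n = 5
inputs = [[(i + 1) % n, (i + 2) % n, (i + 3) % n] for i in range(n)]

p = 0.235
r = 2 * p * (1 - p)                      # unstructured tables with bias p
lam_unstructured = largest_eigenvalue(activity_matrix(inputs, [[r] * 3] * n))

# Hypothetical activities: one strong input and two weakened ones per node.
lam_canalized = largest_eigenvalue(
    activity_matrix(inputs, [[0.47, 0.2491, 0.2491]] * n))

verdicts = (classify(lam_unstructured), classify(lam_canalized))
```

With these numbers the unstructured case gives a largest eigenvalue of about 1.079 (unstable), while the second set of activities gives about 0.968 (stable), so the two cases fall on opposite sides of the predicted transition.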

Before completing the details of the theory (i.e., specifying how to obtain $r_{ik}$ from
$\{p_i^{(k,s)}\}$ and the network topology), we present numerical results testing our derived
criterion for the stability of Boolean networks with canalizing truth tables in Fig. 1. We consider
two cases of canalization in the truth tables: (a) a varying proportion of nodes have a single,
strictly canalizing input (open markers); and (b) each node has a single quasicanalizing
input of increasing strength (filled markers). We also test the criterion on two network
topologies: a network with exactly K = 3 inputs and outputs per node (circles),
and a network where the in- and out-degrees are drawn independently from a truncated
power-law distribution: $P(K) \propto K^{-\gamma}$ if $K^{min} \le K \le K^{max}$, and 0 otherwise. This
yields a network with equal average in- and out-degree $\langle K \rangle \approx 3$. Both networks have
$N = 10^4$ nodes, and the largest eigenvalues of the adjacency matrices of both networks are
approximately $\lambda \approx 3$. All nodes in both networks have a uniform effective bias $p^* = 0.235$
for every data point. In the absence of canalization, Ref. [4] predicts that the networks
would be slightly in the chaotic regime. Each data point in Fig. 1 is the average steady-state
Hamming distance measurement of 100 different frozen realizations of the truth tables. The
steady-state Hamming distance is calculated as the average Hamming distance from time
t = 90 to t = 100 between trajectories that have an initial Hamming distance of 10 (0.1%
of the nodes are flipped).

FIG. 1. Steady state Hamming distance vs. $\lambda_R$ for two different $N = 10^4$ node networks: an
N-K network where each node has K = 3 inputs (circles), and a network with truncated power
law degree distribution with $\langle K \rangle = 3.08$ (squares). $\lambda_R$ is varied in two ways: either by
giving each node a single quasicanalizing input of increasing strength (filled markers), or by giving
an increasing proportion of nodes a single, strictly canalizing input (open markers). The predicted
transition is at $\lambda_R = 1$ (dashed line).

In the first method of varying $\lambda_R$, we increase the proportion of nodes that have a single
canalizing input, from no node having a canalizing input to every node having one. For
each node in the network, we choose whether the node will have a canalizing input with
probability $p_{can}$. When choosing a canalizing input c to node i, we wish to maximize the
impact on $\lambda_R$, so we choose the c that has the minimum value of $K_c^{in} K_c^{out}$. When
$p_{can} = 0$, $\lambda_R$ takes its maximum value; when $p_{can} = 1$, $\lambda_R$ takes its minimum value.

In the second method of varying $\lambda_R$, where each node has a single quasicanalizing input,
we choose the canalizing input c as above. When assigning a generalized canalized truth
table, we randomly choose the canalizing value v to be zero or one with uniform probability.
To vary $\lambda_R$, we vary $p_i^{(c,v)}$ from zero to $p^*$: when $p_i^{(c,v)} = 0$, all nodes have a
single strictly canalizing input (i.e., it is identical to the case where $p_{can} = 1$ above); when
$p_i^{(c,v)} = p^*$, c is no longer a canalizing input and the network is identical to the case
where $p_{can} = 0$ above.
A significant result from Fig. 1 is that while the two networks trace different curves, the
results from the two methods of varying $\lambda_R$ appear to lie, up to noise, on the same curve.
We see that the result from our stability criterion (the dashed line) appears to give an extremely
good prediction of the transition from the zero Hamming distance state (stability).

Having discussed the condition under which a semiannealed, canalizing Boolean network
is stable and given a numerical test for it, we now return to the task of deriving the annealing
procedure for the truth tables and, using those results, derive an expression for $r_{ik}$ in terms
of the $p_i^{(k,s)}$. We define the appropriate annealing procedure used on the truth tables by
specifying the probability that a given output value of $f_i$ is one. A useful quantity in the
following analysis is the 'effective bias' $p_i^*$, which is the probability that any truth table
output is one, similar to the unstructured case. Letting $L = 2^{K_i^{in}}$ be the number of rows in
the truth table, the expected number of ones in the output column of the truth table of node
i is $p_i^* L$. For any given arbitrary input k to node i, L/2 rows in the truth table have $s_k = 0$
and L/2 have $s_k = 1$. By definition of $p_i^{(k,s)}$, the expected number of ones in the output
of the truth table with entries that have $s_k = s$ is $p_i^{(k,s)} L/2$. The total expected number of
ones is the sum of the expected number of ones when $s_k = 0$ and when $s_k = 1$, which leads
to

$$p_i^* = \frac{p_i^{(k,0)} + p_i^{(k,1)}}{2}. \tag{5}$$

Note that, since the expected number of ones does not depend on our choice of k above,
$p_i^{(k,0)} + p_i^{(k,1)}$ must be independent of k. This provides a constraint on the set of possible
$p_i^{(k,s)}$ values that describe a realizable truth table; the full set of constraints will not be
needed for what follows and will be discussed in a followup paper. Non-canalizing inputs
have both $p_i^{(k,0)}$ and $p_i^{(k,1)}$ equal to the effective bias by definition, and unstructured truth
tables have $p_i = p_i^*$.

We now derive the probability that a given set of input values to node i,
$\{s_1, s_2, \ldots, s_{K_i^{in}}\}$, yields an output of one, and we denote this probability
$\phi_i(s_1, \ldots, s_{K_i^{in}}) = \Pr[f_i = 1 \mid I_1, \ldots, I_{K_i^{in}}]$.
Using Bayes' Theorem, we have

$$\phi_i(s_1, \ldots, s_{K_i^{in}}) = \frac{\Pr[I_1, \ldots, I_{K_i^{in}} \mid f_i = 1]\, \Pr[f_i = 1]}{\Pr[I_1, \ldots, I_{K_i^{in}}]}, \tag{6}$$

where $I_k$ is the event that the k-th input takes the value $s_k$ (i.e., $I_k$ is the event that
$\sigma_j = s_k$, where $s_k$ denotes a specific value, 0 or 1, of the node $j(i,k)$'s state variable
$\sigma_j$). By definition, $\Pr[f_i = 1] = p_i^*$. Since we are considering an ensemble where every
possible input string to $f_i$ has equal probability, $\Pr[I_1, \ldots, I_{K_i^{in}}] = 2^{-K_i^{in}}$.
We note that since the events $I_k$ are independent,
$\Pr[I_1, \ldots, I_{K_i^{in}} \mid f_i = 1] = \prod_k \Pr[I_k \mid f_i = 1]$. We calculate
$\Pr[I_k \mid f_i = 1]$ again using Bayes' Theorem:

$$\Pr[I_k \mid f_i = 1] = \frac{\Pr[f_i = 1 \mid I_k]\, \Pr[I_k]}{\Pr[f_i = 1]} = \frac{p_i^{(k,s_k)}}{2 p_i^*}. \tag{7}$$

Using these results in Eq. (6), we obtain

$$\phi_i(s_1, \ldots, s_{K_i^{in}}) = p_i^* \prod_{k=1}^{K_i^{in}} \frac{p_i^{(k,s_k)}}{p_i^*}. \tag{8}$$

Thus, our new semiannealing procedure, generalized to include canalization, randomly
reassigns each output element of the truth table of all nodes at each time according to the
probability given by Eq. (8).
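
A minimal sketch of this reassignment step, assuming the product form of Eq. (8), is given below: `phi` evaluates the output probability for one input row, and `sample_truth_table` draws one realization of a node's truth table. Both function names, the data layout `p[k][s]`, and the decision to raise an error when the product falls outside [0, 1] are assumptions made for illustration; the paper only notes that realizable truth tables must satisfy further constraints.

```python
import random
from itertools import product

def phi(p, p_star, row):
    """Eq. (8): probability that the output for input row (s_1, ..., s_K) is one.

    p[k][s] plays the role of p_i^(k,s); p_star is the effective bias (> 0).
    """
    value = p_star
    for k, s in enumerate(row):
        value *= p[k][s] / p_star
    return value

def sample_truth_table(p, p_star, rng):
    """Draw one semiannealed truth table: each output is one with probability phi."""
    table = {}
    for row in product((0, 1), repeat=len(p)):
        prob = phi(p, p_star, row)
        if not 0.0 <= prob <= 1.0:       # assumption: reject unrealizable inputs
            raise ValueError("p values do not give a valid probability")
        table[row] = 1 if rng.random() < prob else 0
    return table

# Example: three inputs, effective bias 0.235; input 0 is strictly canalizing
# (value 0 forces output 0), the other two inputs are non-canalizing.
p_star = 0.235
p = [[0.0, 0.47], [0.235, 0.235], [0.235, 0.235]]
table = sample_truth_table(p, p_star, random.Random(1))
mean_phi = sum(phi(p, p_star, row) for row in product((0, 1), repeat=3)) / 8
```

In this example every row with the canalizing input at zero has output probability zero, the remaining rows have output probability 0.47, and the average over all rows recovers the effective bias.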

Using Eq. (8), we can calculate the activity of input k on $f_i$, $r_{ik}$, using the definition
that it is the probability that the output of $f_i$ changes if only input k changes. We define
$\phi_i^{(k,s)} = \phi_i(s_1, \ldots, s_{k-1}, s, s_{k+1}, \ldots)$ to be a $K_i^{in} - 1$ input function
that denotes the probability that the truth table output is one if input k is s given some
$K_i^{in} - 1$ element set of other inputs. With this, we calculate the activity as

$$r_{ik} = \left\langle \phi_i^{(k,0)} \big(1 - \phi_i^{(k,1)}\big) + \phi_i^{(k,1)} \big(1 - \phi_i^{(k,0)}\big) \right\rangle, \tag{9}$$

where $\langle \cdot \rangle$ is the average over all $K_i^{in} - 1$ states $s_{k'}$ for $k' \neq k$.

This completes our derivation of our result for assigning an R matrix to a given Boolean
network topology and specification of $\{p_i^{(k,s)}\}$. It is this result, along with Eq. (4), that we
have used in obtaining the stability threshold plotted as the dashed line in Fig. 1.
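
The activity of Eq. (9) can be evaluated directly by enumerating the states of the other inputs, as in the sketch below. It assumes the product form of Eq. (8) for the output probabilities; the function names and the two example parameter sets (an unstructured table and a table with one strictly canalizing input, both at effective bias 0.235) are illustrative choices.

```python
from itertools import product

def phi(p, p_star, row):
    """Eq. (8): probability that the output for input row (s_1, ..., s_K) is one."""
    value = p_star
    for k, s in enumerate(row):
        value *= p[k][s] / p_star
    return value

def semiannealed_activity(p, p_star, k):
    """Eq. (9): probability that independently drawn outputs for s_k = 0 and
    s_k = 1 differ, averaged over all states of the other inputs."""
    k_in = len(p)
    total = 0.0
    for others in product((0, 1), repeat=k_in - 1):
        row0 = others[:k] + (0,) + others[k:]    # input k set to 0
        row1 = others[:k] + (1,) + others[k:]    # input k set to 1
        a, b = phi(p, p_star, row0), phi(p, p_star, row1)
        total += a * (1 - b) + b * (1 - a)
    return total / 2 ** (k_in - 1)

p_star = 0.235
unstructured = [[p_star, p_star]] * 3
canalized = [[0.0, 2 * p_star], [p_star, p_star], [p_star, p_star]]

r_unstructured = [semiannealed_activity(unstructured, p_star, k) for k in range(3)]
r_canalized = [semiannealed_activity(canalized, p_star, k) for k in range(3)]
```

For the unstructured table every activity equals 2p*(1 - p*), about 0.360, whereas making one input strictly canalizing raises its activity to 0.47 and lowers the other two to about 0.249, so the total activity per node drops from roughly 1.079 to roughly 0.968.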



In this paper we have presented a probabilistic generalization of canalizing functions. Our
generalization allows a continuum in the degree of canalization, as opposed to the previous
model [4] where an input could only be strictly canalizing or not canalizing at all. We believe
that our generalized canalization model could be of enhanced relevance to gene networks.
We used this generalized canalization model to define a semiannealing procedure where the
update functions of every node in the network are randomly reassigned at each time step,
but the network of connections (i.e., the network topology) remains frozen. We employed
this semiannealing situation, along with the supposition that it yields results for the frozen
case, to derive, in Eq. (4), the condition under which Boolean networks that have canalizing
functions are stable, and we numerically confirmed our supposition. Given the likely
prominence of canalizing behavior in gene networks, these results may offer significant input
into the understanding of these systems. Furthermore, since our results allow analysis of
any specified network (e.g., an experimentally determined network), our stability criterion
may eventually, with advances in gene network measurement techniques, allow one to assess
the criticality of real genetic networks.

We thank Wolfgang Losert for his comments. This work was partially supported by ONR 
Grant N00014-07-1-0734. 